95 research outputs found

    What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation

    kNN-MT presents a new paradigm for domain adaptation by building an external datastore, which usually saves all target language token occurrences in the parallel corpus. As a result, the constructed datastore is usually large and possibly redundant. In this paper, we investigate the interpretability issue of this approach: what knowledge does the NMT model need? We propose the notion of local correctness (LAC) as a new angle, which describes the potential translation correctness for a single entry and for a given neighborhood. Empirical study shows that our investigation successfully finds the conditions where the NMT model could easily fail and need related knowledge. Experiments on six diverse target domains and two language-pairs show that pruning according to local correctness brings a light and more explainable memory for kNN-MT domain adaptation

    Variable-Based Fault Localization via Enhanced Decision Tree

    Fault localization, aiming at localizing the root cause of the bug under repair, has been a longstanding research topic. Although many approaches have been proposed in the last decades, most of the existing studies work at coarse-grained statement or method levels with very limited insights about how to repair the bug (granularity problem), but few studies target the finer-grained fault localization. In this paper, we target the granularity problem and propose a novel finer-grained variable-level fault localization technique. Specifically, we design a program-dependency-enhanced decision tree model to boost the identification of fault-relevant variables via discriminating failed and passed test cases based on the variable values. To evaluate the effectiveness of our approach, we have implemented it in a tool called VARDT and conducted an extensive study over the Defects4J benchmark. The results show that VARDT outperforms the state-of-the-art fault localization approaches with at least 247.8% improvements in terms of bugs located at Top-1, and the average improvements are 330.5%. Besides, to investigate whether our finer-grained fault localization result can further improve the effectiveness of downstream APR techniques, we have adapted VARDT to the application of patch filtering, where VARDT outperforms the state-of-the-art PATCH-SIM by filtering 26.0% more incorrect patches. The results demonstrate the effectiveness of our approach and it also provides a new way of thinking for improving automatic program repair techniques
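As a rough illustration of discriminating failed and passed test cases by variable values, the sketch below ranks variables by the best Gini impurity reduction a single threshold split can achieve (a depth-one decision tree). It is not VARDT: the program-dependency enhancement described in the abstract is omitted, and the function names and toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def gini(labels: List[bool]) -> float:
    """Gini impurity of pass/fail labels (True = failed test)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values: List[float], failed: List[bool]) -> float:
    """Largest impurity reduction from one threshold split on a variable."""
    parent = gini(failed)
    n = len(values)
    best = 0.0
    for threshold in sorted(set(values)):
        left = [f for v, f in zip(values, failed) if v <= threshold]
        right = [f for v, f in zip(values, failed) if v > threshold]
        if not left or not right:
            continue  # a split must separate the runs into two groups
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_variables(runs: List[Dict[str, float]],
                   failed: List[bool]) -> List[Tuple[str, float]]:
    """Sort variables by how well their values separate failed from passed runs."""
    names = sorted(runs[0])
    scored = [(name, best_split_gain([r[name] for r in runs], failed))
              for name in names]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: variable values observed in four test runs.
RUNS = [
    {"x": -2.0, "y": 5.0},
    {"x": -1.0, "y": 1.0},
    {"x": 3.0, "y": 4.0},
    {"x": 4.0, "y": 2.0},
]
FAILED = [True, True, False, False]  # the first two runs failed
RANKING = rank_variables(RUNS, FAILED)
```

In this toy data the failing runs are exactly those with a negative `x`, so `x` ranks above `y`.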

    Continuous-Time Fixed-Lag Smoothing for LiDAR-Inertial-Camera SLAM

    Localization and mapping with heterogeneous multi-sensor fusion have been prevalent in recent years. To adequately fuse multi-modal sensor measurements received at different time instants and different frequencies, we estimate the continuous-time trajectory by fixed-lag smoothing within a factor-graph optimization framework. With the continuous-time formulation, we can query poses at any time instant corresponding to the sensor measurements. To bound the computational complexity of the continuous-time fixed-lag smoother, we maintain temporal and keyframe sliding windows of constant size, and probabilistically marginalize out control points of the trajectory and other states, which preserves prior information for future sliding-window optimization. Based on continuous-time fixed-lag smoothing, we design tightly-coupled multi-modal SLAM algorithms with a variety of sensor combinations, like the LiDAR-inertial and LiDAR-inertial-camera SLAM systems, in which online time-offset calibration is also naturally supported. More importantly, benefiting from the marginalization and our derived analytical Jacobians for optimization, the proposed continuous-time SLAM systems can achieve real-time performance despite the high complexity of the continuous-time formulation. The proposed multi-modal SLAM systems have been extensively evaluated on three public datasets and self-collected datasets. The results demonstrate that the proposed continuous-time SLAM systems can achieve high-accuracy pose estimations and outperform existing state-of-the-art methods. To benefit the research community, we will open source our code at https://github.com/APRIL-ZJU/clic
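To make "query poses at any time instant" concrete, here is a minimal sketch of evaluating a trajectory defined by control points at an arbitrary timestamp. The abstract does not state the parameterization, so this assumes a uniform cubic B-spline over positions only; rotation, marginalization, and the factor graph are left out, and all names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def query_position(control_points: Sequence[Vec3], t0: float, dt: float,
                   t: float) -> Vec3:
    """Evaluate a uniform cubic B-spline at time t.

    Assumption: segment i spans [t0 + i*dt, t0 + (i+1)*dt] and is blended
    from control points i, i+1, i+2, i+3.
    """
    n = len(control_points)
    if n < 4:
        raise ValueError("a cubic B-spline needs at least 4 control points")
    s = (t - t0) / dt
    if s < 0.0 or s > n - 3:
        raise ValueError("t is outside the time span covered by the spline")
    i = min(int(math.floor(s)), n - 4)  # clamp so the end time is queryable
    u = s - i
    # Uniform cubic B-spline basis weights; they sum to 1 for any u.
    b0 = (1.0 - u) ** 3 / 6.0
    b1 = (3.0 * u**3 - 6.0 * u**2 + 4.0) / 6.0
    b2 = (-3.0 * u**3 + 3.0 * u**2 + 3.0 * u + 1.0) / 6.0
    b3 = u**3 / 6.0
    weights = (b0, b1, b2, b3)
    window = control_points[i:i + 4]
    x = sum(w * p[0] for w, p in zip(weights, window))
    y = sum(w * p[1] for w, p in zip(weights, window))
    z = sum(w * p[2] for w, p in zip(weights, window))
    return (x, y, z)


# Hypothetical control points lying on a straight line along x.
CONTROL_POINTS = [(float(k), 0.0, 0.0) for k in range(5)]
POSE_A = query_position(CONTROL_POINTS, t0=0.0, dt=1.0, t=0.5)
POSE_B = query_position(CONTROL_POINTS, t0=0.0, dt=1.0, t=1.25)
```

Because only four control points influence any query time, measurements arriving at different rates can all be tied to the same small set of trajectory states.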

    1st Place Solution of Egocentric 3D Hand Pose Estimation Challenge 2023 Technical Report: A Concise Pipeline for Egocentric Hand Pose Reconstruction

    This report introduces our work for the Egocentric 3D Hand Pose Estimation workshop. Using AssemblyHands, this challenge focuses on egocentric 3D hand pose estimation from a single-view image. In the competition, we adopt ViT-based backbones and a simple regressor for 3D keypoint prediction, which provides strong model baselines. We noticed that hand-object occlusions and self-occlusions lead to performance degradation, and thus proposed a non-model method to merge multi-view results in the post-processing stage. Moreover, we utilized test-time augmentation and model ensembling for further improvement. We also found that public datasets and rational preprocessing are beneficial. Our method achieved 12.21mm MPJPE on the test dataset, achieving first place in the Egocentric 3D Hand Pose Estimation challenge
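For readers unfamiliar with the reported metric, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joints. The sketch below shows that standard definition on made-up coordinates; the challenge's exact evaluation protocol (for example, any alignment step) is not given in the abstract and is not modelled here.

```python
import math
from typing import Sequence, Tuple

Joint = Tuple[float, float, float]


def mpjpe(predicted: Sequence[Joint], ground_truth: Sequence[Joint]) -> float:
    """Mean Euclidean distance between corresponding joints, in input units."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("joint lists must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical two-joint example in millimetres.
PRED = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
TRUTH = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
ERROR_MM = mpjpe(PRED, TRUTH)
```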

    Extrapolating Large Language Models to Non-English by Aligning Languages

    Existing large language models show disparate capability across different languages, due to the imbalance in the training data. Their performances on English tasks are often stronger than on tasks of other languages. In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages. We start from targeting individual languages by performing cross-lingual instruction-tuning (CoIT) on LLaMA, i.e. tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and formulate underlying scaling laws to investigate the advantages of using scalable translation data. Then we perform multilingual instruction-tuning (MuIT) with mixed resources to build multilingual m-LLaMA. We also illustrate how we leverage the scaling laws to optimize data allocation in a resource-constrained setting. Experiment results on cross-lingual benchmarks XQUAD and MLQA show that x-LLaMAs surpass the English instruction-tuned counterpart (Alpaca) by an average of 27.83% across six non-English languages. Evaluation results on translation dataset Flores-101 show that x-LLaMAs outperform previous LLaMA-based models by an average of 18.89%. Encouragingly, m-LLaMA achieves comparable performance to x-LLaMAs on individual languages and demonstrates the ability to follow multilingual instructions. Further analysis on response content and representation space reveals the alignment of the multilingual semantic space within the middle layers of m-LLaMA

    Gut microbial biomarkers for the treatment response in first-episode, drug-naive schizophrenia: a 24-week follow-up study

    Preclinical studies have shown that the gut microbiota can play a role in schizophrenia (SCH) pathogenesis via the gut-brain axis. However, its role in the antipsychotic treatment response is unclear. Here, we present a 24-week follow-up study to identify gut microbial biomarkers for SCH diagnosis and treatment response, using a sample of 107 first-episode, drug-naive SCH patients and 107 healthy controls (HCs). We collected biological samples at baseline (all participants) and at follow-up time points after risperidone treatment (SCH patients). Treatment response was assessed using the Positive and Negative Syndrome Scale total (PANSS-T) score. False discovery rate was used to correct for multiple testing. We found that SCH patients showed lower alpha-diversity (the Shannon and Simpson's indices) compared to HCs at baseline (p = 1.21 × 10^-9 and 1.23 × 10^-8, respectively). We also found a significant difference in beta-diversity between SCH patients and HCs (p = 0.001). At baseline, using microbes that showed different abundance between patients and controls as predictors, a prediction model can distinguish patients from HCs with an area under the curve (AUC) of 0.867. In SCH patients, after 24 weeks of risperidone treatment, we observed an increase of alpha-diversity toward the basal level of HCs. At the genus level, we observed decreased abundance of Lachnoclostridium (p = 0.019) and increased abundance of Romboutsia (p = 0.067). Moreover, the treatment response in SCH patients was significantly associated with the basal levels of Lachnoclostridium and Romboutsia (p = 0.005 and 0.006, respectively). Our results suggest that SCH patients may present characteristic microbiota, and certain microbiota biomarkers may predict treatment response in this patient population

    Incidence and Etiology of Drug-Induced Liver Injury in Mainland China

    Background & Aims: We performed a nationwide, retrospective study to determine the incidence and causes of drug-induced liver injury (DILI) in mainland China. Methods: We collected data on a total of 25,927 confirmed DILI cases, hospitalized from 2012 through 2014 at 308 medical centers in mainland China. We collected demographic, medical history, treatment, laboratory, disease severity, and mortality data from all patients. Investigators at each site were asked to complete causality assessments for each case whose diagnosis at discharge was DILI (n=29,478) according to the Roussel Uclaf Causality Assessment Method. Results: Most cases of DILI presented with hepatocellular injury (51.39%; 95% CI, 50.76–52.03), followed by mixed injury (28.30%; 95% CI, 27.73–28.87) and cholestatic injury (20.31%; 95% CI, 19.80–20.82). The leading single classes of implicated drugs were traditional Chinese medicines or herbal and dietary supplements (26.81%) and anti-tuberculosis medications (21.99%). Chronic DILI occurred in 13.00% of the cases and, although 44.40% of the hepatocellular DILI cases fulfilled Hy’s Law criteria, only 280 cases (1.08%) progressed to hepatic failure, 2 cases underwent liver transplantation (0.01%), and 102 patients died (0.39%). Among deaths, DILI was judged to have a primary role in 72 (70.59%), a contributory role in 21 (20.59%), and no role in 9 (8.82%). Assuming the proportion of DILI in the entire hospitalized population of China was represented by that observed in the 66 centers where DILI capture was complete, we estimated the annual incidence in the general population to be 23.80 per 100,000 persons (95% CI, 20.86–26.74). Only hospitalized patients were included in this analysis, so the true incidence is likely to be higher. Conclusions: In a retrospective study to determine the incidence and causes of drug-induced liver injury (DILI) in mainland China, the annual incidence in the general population was estimated to be 23.80 per 100,000 persons, higher than that reported from Western countries. Traditional Chinese medicines, herbal and dietary supplements, and anti-tuberculosis drugs were the leading causes of DILI in mainland China
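The outcome percentages in this abstract can be re-derived from the stated counts. The short check below assumes the denominator for outcomes is the 25,927 confirmed cases and the denominator for causality roles is the 102 deaths; the abstract does not state this explicitly, but the arithmetic agrees with it. Variable names are illustrative.

```python
CONFIRMED_CASES = 25_927  # confirmed DILI cases reported in the abstract


def pct(count: int, total: int) -> float:
    """Percentage rounded to two decimals, as printed in the abstract."""
    return round(100.0 * count / total, 2)


# Outcome counts taken from the abstract.
OUTCOMES = {"hepatic failure": 280, "liver transplantation": 2, "death": 102}
OUTCOME_PCT = {name: pct(n, CONFIRMED_CASES) for name, n in OUTCOMES.items()}

# Judged role of DILI among the 102 deaths.
DEATH_ROLES = {"primary": 72, "contributory": 21, "none": 9}
ROLE_PCT = {role: pct(n, sum(DEATH_ROLES.values()))
            for role, n in DEATH_ROLES.items()}

# Injury-pattern shares should partition the confirmed cases.
INJURY_PATTERN_PCT = {"hepatocellular": 51.39, "mixed": 28.30,
                      "cholestatic": 20.31}
PATTERN_TOTAL = sum(INJURY_PATTERN_PCT.values())
```

Under these assumptions the computed values match the figures quoted in the abstract, and the three injury patterns sum to 100%.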